Overview

Dataset statistics

Number of variables: 10
Number of observations: 1000
Missing cells: 17
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 78.2 KiB
Average record size in memory: 80.1 B
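
The percentage and per-record figures above follow from the raw counts. The sketch below recomputes them from the reported values; it assumes 1 KiB = 1024 bytes, and the variable names are illustrative only.

```python
# Values copied from the dataset statistics above.
n_variables = 10
n_observations = 1000
missing_cells = 17          # 8 in feat.b plus 9 in feat.d
total_size_kib = 78.2

total_cells = n_variables * n_observations        # 10 x 1000 cells
missing_pct = 100 * missing_cells / total_cells   # 0.17, shown rounded as 0.2%

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```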

Variable types

Categorical: 4
Numeric: 6

Alerts

feat.e is highly correlated with feat.i (High correlation)
feat.f is highly correlated with response (High correlation)
feat.i is highly correlated with feat.e (High correlation)
response is highly correlated with feat.f (High correlation)
feat.a has unique values (Unique)
feat.e has unique values (Unique)
feat.f has unique values (Unique)
feat.h has unique values (Unique)
feat.i has unique values (Unique)

Reproduction

Analysis started: 2022-11-22 19:51:09.936592
Analysis finished: 2022-11-22 19:51:15.893089
Duration: 5.96 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json

Variables

response
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
1      553
0      447

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
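
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the standard library returns two-letter codes (`Nd` for Decimal Number, `Po` for Other Punctuation, `Ll` for Lowercase Letter) rather than the long names shown in this report. Scripts and blocks are not exposed by `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    counts = Counter()
    for value in values:
        for ch in str(value):
            counts[unicodedata.category(ch)] += 1
    return counts

# Single-digit labels contain only decimal digits.
print(category_counts(["1", "0", "1"]))
# Labels such as "1.0" mix decimal digits with punctuation.
print(category_counts(["1.0", "0.0"]))
```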

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      553    55.3%
0      447    44.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1      553    55.3%
0      447    44.7%

Most occurring characters

Value  Count  Frequency (%)
1      553    55.3%
0      447    44.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1000   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      553    55.3%
0      447    44.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  1000   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      553    55.3%
0      447    44.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1000   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      553    55.3%
0      447    44.7%

feat.a
Real number (ℝ)

UNIQUE

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.048383598
Minimum: -7.429324037
Maximum: 10.7231198
Zeros: 0
Zeros (%): 0.0%
Negative: 353
Negative (%): 35.3%
Memory size: 7.9 KiB

Quantile statistics

Minimum: -7.429324037
5th percentile: -3.867752929
Q1: -0.8849727281
Median: 1.027628916
Q3: 2.9938056
95th percentile: 6.028401614
Maximum: 10.7231198
Range: 18.15244384
Interquartile range (IQR): 3.878778328

Descriptive statistics

Standard deviation: 2.975084929
Coefficient of variation (CV): 2.837782788
Kurtosis: -0.0686019667
Mean: 1.048383598
Median Absolute Deviation (MAD): 1.95091297
Skewness: 0.06539204332
Sum: 1048.383598
Variance: 8.851130336
Monotonicity: Not monotonic
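
Several of the descriptive statistics above are tied together by simple identities: the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below computes these quantities for a small list. It assumes the sample standard deviation (divisor n - 1), which is the pandas default but is not stated in the report, and the helper name `describe` is hypothetical.

```python
import statistics

def describe(values):
    """Return a few descriptive statistics for a list of numbers."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)      # sample standard deviation (n - 1), assumed
    median = statistics.median(values)
    # Median Absolute Deviation: median of absolute distances to the median.
    mad = statistics.median(abs(v - median) for v in values)
    return {
        "mean": mean,
        "std": std,
        "variance": std ** 2,
        "cv": std / mean,               # undefined when the mean is zero
        "mad": mad,
        "sum": sum(values),
    }

stats = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats)

# Consistency check on the values reported for feat.a.
reported_std, reported_mean = 2.975084929, 1.048383598
print(reported_std / reported_mean)     # close to the reported CV
print(reported_std ** 2)                # close to the reported variance
```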
Histogram with fixed size bins (bins=50)
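
A histogram with fixed size bins splits the interval from the minimum to the maximum into equal-width bins and counts the values in each. The following is a minimal sketch of that idea, not the report's own plotting code; the function name is hypothetical, and it assumes the common convention that the maximum value is counted in the last bin.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum lands exactly on the right edge; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges

counts, edges = fixed_width_histogram([0, 1, 2, 3, 4], bins=2)
print(counts, edges)
```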
Common Values

Value               Count  Frequency (%)
-0.6814269397       1      0.1%
0.1177140185        1      0.1%
3.307156885         1      0.1%
1.362157988         1      0.1%
3.590945302         1      0.1%
5.141543585         1      0.1%
6.898744046         1      0.1%
0.9148148358        1      0.1%
-5.747153271        1      0.1%
1.094578014         1      0.1%
Other values (990)  990    99.0%

Minimum 10 values

Value               Count  Frequency (%)
-7.429324037        1      0.1%
-6.982768395        1      0.1%
-6.929446856        1      0.1%
-6.805099011        1      0.1%
-6.523753407        1      0.1%
-6.397694581        1      0.1%
-5.927506627        1      0.1%
-5.747153271        1      0.1%
-5.674963089        1      0.1%
-5.631899332        1      0.1%

Maximum 10 values

Value               Count  Frequency (%)
10.7231198          1      0.1%
9.07514201          1      0.1%
9.054576998         1      0.1%
8.726349291         1      0.1%
8.714374438         1      0.1%
8.65907834          1      0.1%
8.463993632         1      0.1%
8.374181476         1      0.1%
8.290679957         1      0.1%
8.250320061         1      0.1%

feat.b
Real number (ℝ)

Distinct: 992
Distinct (%): 100.0%
Missing: 8
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: -3.941515104
Minimum: -8.571791335
Maximum: 1.085556232
Zeros: 0
Zeros (%): 0.0%
Negative: 987
Negative (%): 98.7%
Memory size: 7.9 KiB

Quantile statistics

Minimum: -8.571791335
5th percentile: -6.50370201
Q1: -4.982480891
Median: -3.917721434
Q3: -2.883325853
95th percentile: -1.59840455
Maximum: 1.085556232
Range: 9.657347567
Interquartile range (IQR): 2.099155038

Descriptive statistics

Standard deviation: 1.512668971
Coefficient of variation (CV): -0.3837785551
Kurtosis: -0.08378414308
Mean: -3.941515104
Median Absolute Deviation (MAD): 1.059235767
Skewness: -0.02269323309
Sum: -3909.982983
Variance: 2.288167417
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value               Count  Frequency (%)
-5.493698087        1      0.1%
-2.481194209        1      0.1%
-2.667889449        1      0.1%
-6.909910232        1      0.1%
-2.465197367        1      0.1%
-3.99181341         1      0.1%
-3.145331546        1      0.1%
-6.479883345        1      0.1%
-4.99998157         1      0.1%
-4.672351284        1      0.1%
Other values (982)  982    98.2%
(Missing)           8      0.8%

Minimum 10 values

Value               Count  Frequency (%)
-8.571791335        1      0.1%
-8.042994054        1      0.1%
-7.943988162        1      0.1%
-7.906057256        1      0.1%
-7.824014162        1      0.1%
-7.693862525        1      0.1%
-7.566110388        1      0.1%
-7.503920998        1      0.1%
-7.470603661        1      0.1%
-7.376567763        1      0.1%

Maximum 10 values

Value               Count  Frequency (%)
1.085556232         1      0.1%
0.935776165         1      0.1%
0.7760667111        1      0.1%
0.224126414         1      0.1%
0.1960867204        1      0.1%
-0.1340978557       1      0.1%
-0.281881298        1      0.1%
-0.3280029895       1      0.1%
-0.3632662883       1      0.1%
-0.3756889403       1      0.1%

feat.c
Categorical

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
b      278
d      255
a      240
c      227

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: b
2nd row: d
3rd row: b
4th row: a
5th row: c

Common Values

Value  Count  Frequency (%)
b      278    27.8%
d      255    25.5%
a      240    24.0%
c      227    22.7%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
b      278    27.8%
d      255    25.5%
a      240    24.0%
c      227    22.7%

Most occurring characters

Value  Count  Frequency (%)
b      278    27.8%
d      255    25.5%
a      240    24.0%
c      227    22.7%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  1000   100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
b      278    27.8%
d      255    25.5%
a      240    24.0%
c      227    22.7%

Most occurring scripts

Value  Count  Frequency (%)
Latin  1000   100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
b      278    27.8%
d      255    25.5%
a      240    24.0%
c      227    22.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1000   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
b      278    27.8%
d      255    25.5%
a      240    24.0%
c      227    22.7%

feat.d
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 9
Missing (%): 0.9%
Memory size: 7.9 KiB

Value  Count
1.0    511
0.0    480

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2973
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count  Frequency (%)
1.0        511    51.1%
0.0        480    48.0%
(Missing)  9      0.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
1.0    511    51.6%
0.0    480    48.4%

Most occurring characters

Value  Count  Frequency (%)
0      1471   49.5%
.      991    33.3%
1      511    17.2%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1982   66.7%
Other Punctuation  991    33.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      1471   74.2%
1      511    25.8%

Other Punctuation
Value  Count  Frequency (%)
.      991    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2973   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      1471   49.5%
.      991    33.3%
1      511    17.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2973   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      1471   49.5%
.      991    33.3%
1      511    17.2%

feat.e
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.5183207529
Minimum: -6.758176383
Maximum: 5.289708782
Zeros: 0
Zeros (%): 0.0%
Negative: 596
Negative (%): 59.6%
Memory size: 7.9 KiB

Quantile statistics

Minimum: -6.758176383
5th percentile: -3.840913833
Q1: -1.779296264
Median: -0.5163815105
Q3: 0.8010775699
95th percentile: 2.774485945
Maximum: 5.289708782
Range: 12.04788516
Interquartile range (IQR): 2.580373833

Descriptive statistics

Standard deviation: 1.984703368
Coefficient of variation (CV): -3.829102649
Kurtosis: -0.03436963598
Mean: -0.5183207529
Median Absolute Deviation (MAD): 1.285863659
Skewness: -0.07150994541
Sum: -518.3207529
Variance: 3.939047458
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value               Count  Frequency (%)
-0.8006149557       1      0.1%
-0.4105429274       1      0.1%
-2.149190977        1      0.1%
0.6691611674        1      0.1%
-2.496597332        1      0.1%
-3.468563012        1      0.1%
0.01555496617       1      0.1%
0.3305800081        1      0.1%
1.550839146         1      0.1%
0.9521521358        1      0.1%
Other values (990)  990    99.0%

Minimum 10 values

Value               Count  Frequency (%)
-6.758176383        1      0.1%
-6.399743569        1      0.1%
-6.335952428        1      0.1%
-6.178752334        1      0.1%
-5.856328822        1      0.1%
-5.819338759        1      0.1%
-5.719050018        1      0.1%
-5.473047809        1      0.1%
-5.271585502        1      0.1%
-5.166574755        1      0.1%

Maximum 10 values

Value               Count  Frequency (%)
5.289708782         1      0.1%
4.807481457         1      0.1%
4.698983411         1      0.1%
4.696980464         1      0.1%
4.628818601         1      0.1%
4.580737248         1      0.1%
4.466210225         1      0.1%
4.245676957         1      0.1%
3.971205537         1      0.1%
3.96399422          1      0.1%

feat.f
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -6.257345469
Minimum: -31.09907622
Maximum: 21.56793583
Zeros: 0
Zeros (%): 0.0%
Negative: 789
Negative (%): 78.9%
Memory size: 7.9 KiB

Quantile statistics

Minimum: -31.09907622
5th percentile: -19.48796709
Q1: -11.65484815
Median: -6.262428864
Q3: -0.912532982
95th percentile: 6.307771472
Maximum: 21.56793583
Range: 52.66701205
Interquartile range (IQR): 10.74231517

Descriptive statistics

Standard deviation: 8.005529545
Coefficient of variation (CV): -1.279381103
Kurtosis: 0.1736241289
Mean: -6.257345469
Median Absolute Deviation (MAD): 5.371094591
Skewness: 0.02786552246
Sum: -6257.345469
Variance: 64.0885033
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value               Count  Frequency (%)
-4.427601788        1      0.1%
5.733868313         1      0.1%
-16.23534137        1      0.1%
-4.527911115        1      0.1%
-11.99220613        1      0.1%
-10.86518825        1      0.1%
-2.641091002        1      0.1%
0.7347983651        1      0.1%
-2.958744506        1      0.1%
-10.27875464        1      0.1%
Other values (990)  990    99.0%

Minimum 10 values

Value               Count  Frequency (%)
-31.09907622        1      0.1%
-30.34507874        1      0.1%
-29.10103863        1      0.1%
-28.74414316        1      0.1%
-28.02887003        1      0.1%
-25.93349585        1      0.1%
-25.90821945        1      0.1%
-25.61192808        1      0.1%
-25.58897083        1      0.1%
-25.03581109        1      0.1%

Maximum 10 values

Value               Count  Frequency (%)
21.56793583         1      0.1%
20.17426201         1      0.1%
19.88434253         1      0.1%
17.93220262         1      0.1%
17.77268027         1      0.1%
17.3905916          1      0.1%
14.17918456         1      0.1%
14.01412077         1      0.1%
13.32045114         1      0.1%
13.04330909         1      0.1%

feat.g
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Value  Count
z      341
x      330
y      329

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: z
2nd row: x
3rd row: y
4th row: y
5th row: z

Common Values

Value  Count  Frequency (%)
z      341    34.1%
x      330    33.0%
y      329    32.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
z      341    34.1%
x      330    33.0%
y      329    32.9%

Most occurring characters

Value  Count  Frequency (%)
z      341    34.1%
x      330    33.0%
y      329    32.9%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  1000   100.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
z      341    34.1%
x      330    33.0%
y      329    32.9%

Most occurring scripts

Value  Count  Frequency (%)
Latin  1000   100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
z      341    34.1%
x      330    33.0%
y      329    32.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1000   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
z      341    34.1%
x      330    33.0%
y      329    32.9%

feat.h
Real number (ℝ≥0)

UNIQUE

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.03083308
Minimum: 3.421248303
Maximum: 17.43144145
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 3.421248303
5th percentile: 6.692429352
Q1: 8.700144228
Median: 10.02830899
Q3: 11.52875894
95th percentile: 13.25505024
Maximum: 17.43144145
Range: 14.01019315
Interquartile range (IQR): 2.828614715

Descriptive statistics

Standard deviation: 2.022200156
Coefficient of variation (CV): 0.2015984257
Kurtosis: 0.03984029824
Mean: 10.03083308
Median Absolute Deviation (MAD): 1.421113657
Skewness: -0.1302689443
Sum: 10030.83308
Variance: 4.089293472
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value               Count  Frequency (%)
10.25419887         1      0.1%
6.754303427         1      0.1%
11.24278619         1      0.1%
12.68298848         1      0.1%
9.352376556         1      0.1%
10.28876694         1      0.1%
7.836294432         1      0.1%
9.988045316         1      0.1%
9.277988632         1      0.1%
8.493310924         1      0.1%
Other values (990)  990    99.0%

Minimum 10 values

Value               Count  Frequency (%)
3.421248303         1      0.1%
3.475701331         1      0.1%
3.945908562         1      0.1%
4.151417736         1      0.1%
4.22770175          1      0.1%
4.51677224          1      0.1%
4.614397502         1      0.1%
4.86227061          1      0.1%
5.022192772         1      0.1%
5.217552022         1      0.1%

Maximum 10 values

Value               Count  Frequency (%)
17.43144145         1      0.1%
16.16747908         1      0.1%
15.85780148         1      0.1%
15.69748807         1      0.1%
14.83414122         1      0.1%
14.79040265         1      0.1%
14.62396292         1      0.1%
14.62363889         1      0.1%
14.48832726         1      0.1%
14.47632645         1      0.1%

feat.i
Real number (ℝ)

HIGH CORRELATION
UNIQUE

Distinct: 1000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.5186066973
Minimum: -6.763426764
Maximum: 5.315728559
Zeros: 0
Zeros (%): 0.0%
Negative: 598
Negative (%): 59.8%
Memory size: 7.9 KiB

Quantile statistics

Minimum: -6.763426764
5th percentile: -3.886815871
Q1: -1.773089044
Median: -0.5060849539
Q3: 0.8030406826
95th percentile: 2.78705719
Maximum: 5.315728559
Range: 12.07915532
Interquartile range (IQR): 2.576129727

Descriptive statistics

Standard deviation: 1.984378137
Coefficient of variation (CV): -3.826364271
Kurtosis: -0.02932702066
Mean: -0.5186066973
Median Absolute Deviation (MAD): 1.292620753
Skewness: -0.07130431438
Sum: -518.6066973
Variance: 3.937756592
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values

Value               Count  Frequency (%)
-0.8280728697       1      0.1%
-0.3358788662       1      0.1%
-2.138754969        1      0.1%
0.7567881697        1      0.1%
-2.505539361        1      0.1%
-3.490892139        1      0.1%
0.07054932741       1      0.1%
0.3639592021        1      0.1%
1.607569457         1      0.1%
0.9758482567        1      0.1%
Other values (990)  990    99.0%

Minimum 10 values

Value               Count  Frequency (%)
-6.763426764        1      0.1%
-6.398746196        1      0.1%
-6.376277409        1      0.1%
-6.192309923        1      0.1%
-5.845633414        1      0.1%
-5.82995219         1      0.1%
-5.756198292        1      0.1%
-5.522454828        1      0.1%
-5.215154372        1      0.1%
-5.174046974        1      0.1%

Maximum 10 values

Value               Count  Frequency (%)
5.315728559         1      0.1%
4.842965736         1      0.1%
4.715903484         1      0.1%
4.646257329         1      0.1%
4.588374531         1      0.1%
4.550044682         1      0.1%
4.446508874         1      0.1%
4.248004011         1      0.1%
3.966186668         1      0.1%
3.947004253         1      0.1%

Interactions

[Figures: 36 pairwise interaction plots; images not included.]

Correlations


Auto

The auto setting selects an easily interpretable pairwise metric according to the types of the two columns: categorical-categorical pairs use Cramér's V, numerical-categorical pairs use Cramér's V (with the numerical column discretized), and numerical-numerical pairs use Spearman's ρ. Each pair of columns therefore gets the measure best suited to its types.
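
The type-to-method mapping described above can be sketched as a small lookup. This is an illustration only, not the library's implementation; the function name and the string labels are hypothetical.

```python
def auto_method(type_a, type_b):
    """Pick a correlation method from two variable types.

    Types are the strings "numerical" or "categorical" (labels assumed here).
    """
    kinds = {type_a, type_b}
    if not kinds <= {"numerical", "categorical"}:
        raise ValueError(f"unsupported variable types: {type_a!r}, {type_b!r}")
    if kinds == {"numerical"}:
        return "spearman"
    # Categorical-categorical, or mixed with the numerical column discretized.
    return "cramers_v"

print(auto_method("numerical", "numerical"))
print(auto_method("numerical", "categorical"))
```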

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
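
A minimal sketch of that calculation in plain Python follows. It assumes tied values receive the average of their ranks, which is a common convention but is not stated in the report; the function names are hypothetical.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# A nonlinear but monotonic relation still gives rho close to 1.
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))
```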

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
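
The formula above can be written out directly, as in the sketch below (the function name is hypothetical). The normalising factor 1/n cancels between the covariance and the standard deviations, so plain sums are used. The last lines illustrate the invariance under a change of location and positive scale.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
# Shifting and rescaling x (with a positive factor) leaves r unchanged.
r_shifted = pearson_r([2 * a + 5 for a in x], y)
print(r, r_shifted)
```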

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
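
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition with a quadratic loop over all pairs; tied pairs count as neither concordant nor discordant. Library implementations may apply a tie correction (tau-b), so results can differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```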

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
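
For illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table using the correction commonly attributed to Bergsma (2013). It is not taken from the report's source code; the function name is hypothetical, and the table is assumed to have no empty rows or columns and at least two of each.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

print(cramers_v_corrected([[10, 10], [10, 10]]))   # independent table
print(cramers_v_corrected([[50, 0], [0, 50]]))     # perfectly associated table
```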

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
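
As a rough sketch of what nullity correlation means, the code below converts two columns into present/missing indicators and correlates the indicators with Pearson's r. Representing missing entries as `None` and returning `None` when a column is always present or always missing are assumptions made for this example; the function name is hypothetical.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None   # always present or always missing: correlation undefined
    return cov / math.sqrt(var_a * var_b)

# One column is missing exactly where the other is present.
print(nullity_correlation([1, None, 3, None], [None, 2, None, 4]))
```

A value of -1 means that one variable is missing exactly when the other is present, and +1 means the two are always missing together.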

Sample

First rows

     response  feat.a     feat.b     feat.c  feat.d  feat.e     feat.f      feat.g  feat.h     feat.i
0    1         -0.681427  -5.493698  b       0.0     -0.800615  -4.427602   z       10.254199  -0.828073
1    1         0.309468   -5.559933  d       1.0     -1.155514  -0.799094   x       9.084749   -1.109698
2    1         5.676125   -4.026970  b       1.0     -3.396331  -0.631966   y       8.753848   -3.417417
3    1         1.211525   -4.198263  a       1.0     -1.894569  -16.273262  y       12.191295  -1.904801
4    1         1.387863   -7.824014  c       1.0     4.696980   -22.208877  z       9.626686   4.715903
5    1         6.145195   -2.439140  c       0.0     -0.574830  11.642609   y       12.362962  -0.521423
6    1         2.382749   -3.625411  a       0.0     1.326984   -4.148881   z       9.226122   1.287618
7    1         -2.795184  -0.375689  c       1.0     -0.869053  -2.994862   x       7.973038   -0.839326
8    0         -1.060559  -2.972203  b       0.0     0.719649   -15.543748  z       12.893124  0.718503
9    1         -0.336986  -4.670439  b       1.0     -0.605454  3.060399    y       9.803020   -0.548610

Last rows

     response  feat.a     feat.b     feat.c  feat.d  feat.e     feat.f      feat.g  feat.h     feat.i
990  1         3.027287   -5.645709  b       1.0     -4.847993  -8.070246   x       12.043355  -4.894286
991  1         -2.222620  -2.611733  a       0.0     0.735233   -2.876741   x       10.090726  0.816545
992  0         2.363733   -3.629801  b       0.0     -5.109591  -7.578162   y       9.301541   -5.119936
993  0         0.360079   -5.105157  d       0.0     -1.393937  -21.575596  y       10.537327  -1.397865
994  0         1.939686   -5.920013  b       0.0     0.098981   -17.421105  z       11.705579  0.084855
995  0         0.730074   -3.885035  b       0.0     -3.356949  -12.803344  z       11.204110  -3.396673
996  1         4.211548   -3.617253  a       0.0     2.034995   6.995753    z       9.208089   2.069752
997  1         -3.053301  -3.583830  c       1.0     1.929012   -7.013105   z       7.637862   1.856356
998  1         -0.567850  -3.194716  c       1.0     -1.849712  4.204816    z       11.725868  -1.862466
999  1         0.252428   -4.690728  d       1.0     1.742044   -4.564031   y       7.909709   1.747037